Quantum information processing and its associated
technologies have reached an interesting and timely stage
in their development, where many different experiments
have been performed establishing the basic building blocks.
The challenge moving forward is to scale up to larger
quantum machines capable of performing tasks not possible
today. This raises a number of interesting questions: How
big will these machines need to be? How many resources will
they consume? These questions need to be urgently addressed.
Here we estimate the resources required to exe- 
cute Shor's factoring algorithm on a distributed 
atom-optics quantum computer architecture. We 
determine the runtime and requisite size of the 
quantum computer as a function of the problem 
size and physical error rate. Our results suggest 
that once experimental accuracy reaches levels 
below the fault-tolerant threshold, further opti- 
misation of computational performance and re- 
sources is largely an issue of how the algorithm 
and circuits are implemented, rather than the 
physical quantum hardware. 

The prospect of an entirely new industry based on quantum
mechanics has motivated technological development and led
to a much better understanding of the principles governing
our universe at the atomic scale. For quantum technology,
experimental progress has been pronounced [1-7]. Not only
has a fledgling industry based on quantum key distribution
already emerged [8-10], but many experimental groups now
routinely demonstrate the ability to create, manipulate and
read out multiple qubits in multiple physical systems with
increasingly high accuracy [11]. The goal of developing a
commercially viable, large-scale quantum computer is now
coming into view. Theoretical progress is also essential,
and fault-tolerant quantum error correction techniques, a
necessity for dealing with imperfect physical components,
have been refined substantially [12-14]. The adaptation of
these techniques to the physical restrictions of quantum
hardware has led to multiple architecture designs,
indicating a clear pathway towards future quantum
computers [15-23].

While a large-scale quantum computer is still years away,
it is now possible to make qualitative and quantitative
predictions about the performance and required resources of
such a computer. Some of the previous predictions consider
hardware architectures based on specific physical systems
[21, 22, 24-26], which is an essential aspect of resource
analysis. However, they omit a full prescription for
executing the algorithms in question. Others consider
promising error-correction codes and circuits, such as
post-selection [12] and topological error correction [14],
yet do so without reference to particular architectures or
applications. Above the hardware device level, there are
a number of layers of implementation needed to finally
run an algorithm. By careful choice of all technological
elements and the integration of all layers of
implementation, a complete analysis is now possible, which
we present in this manuscript.

A full account of the resources required for fault- 
tolerant quantum computation must consider a number 
of factors. Each physical component in our computer suf- 
fers from errors, therefore an appropriate error correcting 
code must be chosen to be compatible with the physical 
restrictions of the hardware. Physical error rates must 
be suppressed below the fault-tolerant threshold of the 
chosen code. Next, the code restricts the set of logically 
encoded gates that can be directly applied to encoded 
data. Each gate in the high-level quantum algorithm 
is then decomposed into a universal set of fault-tolerant 
primitives. To realise these universal primitives, ancillary
states and protocols are typically required to enact
teleported gates that could otherwise not be directly applied
to the encoded data [27-29]. Each of these steps increases
the total qubit/time overhead, and they must be carefully
integrated so that all steps are counted.

The precise details of how resources must be calculated
depend on the properties of the architecture in question,
the techniques utilised for fault-tolerant error correction,
and the desired algorithm. In this work we utilise a
topological error correction code implemented on a large
three dimensional cluster state of qubits [14]. This error
correction technique, despite being the preferred protocol
in large scale architectures, has only been briefly studied
with regard to how a large scale algorithm is implemented.
Translating an abstract quantum algorithm into the specific
operations needed in the cluster, i.e. the development of a
classical compiler, has only just begun. This step is
anticipated to have a direct impact on the physical
resources needed for computation. Typically, estimates
consider the number of required gates in the high-level
quantum algorithm and the basic amount of ancillary space
needed for additional fault-tolerant protocols [21, 22, 24].
However, these estimates provide only a partial analysis.
Error correction codes inevitably suffer from constraints
that need to be taken into account: specifically, the
interaction of qubits required by the actual algorithm and
qubits needed for ancillary fault-tolerant protocols. The
scheduling and routing of these ancillary protocols are
often overlooked when estimating resources and are likely
to dramatically affect resource estimates.

By contrast, compatibility of the topological model with
hardware architectures has been demonstrated [20-23]. In
our complete analysis, we employ an atom-optics
architecture [20, 30], which is based on the photonic module
[31]. The photonic module is a relatively simple device
that allows an atomic qubit to mediate the generation of
photonic entanglement. The 3D cluster state that supports
topological error correction is then created by an array of
these devices. Decomposition of each logical gate into a
series of physical operations in this architecture is
clear, and hence all the geometry and connectivity
constraints at the logical and physical level can
explicitly be included in the analysis.

The desired algorithm, Shor's algorithm, is a comparatively
simple application compared to other problems solvable by a
quantum computer [26, 32]. More importantly, it has a rich
history of theoretical development and explicit circuit
constructions. Hence we can choose a circuit construction
amenable to the system design defined above. However, to
run the circuit, we still have to take the geometric
constraints at the logical level into account. Even though
scheduling analysis at the physical level is taken care of
by the topological quantum computer model, the scheduling
and arrangement of gates and ancillary operations within the
logical space created by the topological cluster impact
performance. This step is largely unexplored and leaves
huge room for optimisation. Circuit optimisation needs to
be done with such restrictions in mind. The ability of an
error corrected system to realise the optimal circuit size
at the logical level depends on adapting to these
constraints, hence estimates should be made with care.

With this given computational system, the number of
photonic modules and the time required to execute the
algorithm, as functions of the problem size and physical
error rate, characterize the computer. By design, the
analysis explicitly deals with all aspects of the error
corrected algorithm, from the bottom device layer to the
top abstract algorithm, giving a unique but standardized
estimation method.



I. PRELIMINARIES 

In the topological cluster state model a three-dimensional
cluster forms the effective Hilbert space in which
computation takes place [14, 33]. The photonic cluster
state is continuously prepared from non-entangled photons
by the hardware.

Logical qubits are introduced as pairs of defects in the 
cluster. Defects are created in the cluster by measuring 
physical qubits that define the defect in the Z basis [14] . 
An entangling gate is realised by braiding pairs of de- 
fects. Logical errors occur when chains of physical errors 
connect or encircle defects, which is made less likely by 
increasing the circumference of defects and by increasing 
their separation. Physical qubits in the bulk of the clus- 
ter, those not associated with defects, are measured in the 
X basis. This reveals the endpoints of chains of errors, 
from which the most likely set of errors can be inferred. 
To estimate physical resources, we are ultimately inter- 
ested in the size of the three-dimensional cluster state 
required to execute Shor's algorithm. 

As the algorithm is executed at the logical level, it is
useful to introduce a scale factor that essentially
encapsulates the overhead associated with error correction
[14]. A logical cell is defined as a three-dimensional
volume of the cluster that has an edge length of
$d + d/4$ unit cells, where $d$ is the distance of the
error-correction code. Defects have a circumference of $d$
unit cells and are separated by $d$ unit cells (Fig. 1).



A. Shor's Algorithm 

We now turn to the circuit for Shor's factoring algorithm.
There are a number of different circuit implementations of
the algorithm [34-37], which assume that arbitrary sets of
qubits can be simultaneously entangled without any penalty
related to their separation. In the topological model, as
gates are realized by braiding defects, one could implement
a gate over a long distance without any penalty. However,
multiple gates are typically implemented at the same time
step, and the necessary scheduling within the topological
cluster yields nontrivial overhead.

FIG. 1: A logical cell; an error correction independent
measure of the size of topological quantum circuits. The
lengths are expressed in terms of unit cells of the cluster
state. The qubit defect is the coloured region centred
within the cell.

An easier approach is to modify an existing circuit so
that it only requires nearest-neighbour gates in some
restricted geometry. This can be done by adding SWAP
gates to the circuit [38-40]. The Beauregard circuit
[37, 38], which we employ in this manuscript, is a Linear
Nearest-Neighbour (LNN) construction [38]. This circuit
is not as efficient as others, but its explicit LNN
construction means we can apply it directly to the
topological cluster without further modification. With
logical qubits arranged in a line, the circuit to factor an
$L$-bit number requires $Q = 2L$ qubits and has depth
$K = 32L^3$, to leading order. The circuit is not
inherently robust to errors [41], requiring an error rate
per gate of approximately $10^{-1}/KQ = 10^{-1}/64L^4$ to
ensure a 90% chance of success.
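The leading-order circuit parameters quoted above can be tabulated with a few lines of Python. This is only an illustrative sketch: the function name `shor_lnn_requirements` and the dictionary keys are hypothetical, and the formulas are taken directly from the preceding paragraph.

```python
def shor_lnn_requirements(L: int) -> dict:
    """Leading-order requirements of the LNN circuit for an L-bit number."""
    Q = 2 * L          # logical qubits, Q = 2L
    K = 32 * L**3      # circuit depth, K = 32 L^3
    # A 10% total failure budget spread over K*Q gate locations.
    gate_error = 0.1 / (K * Q)
    return {"qubits": Q, "depth": K, "gate_error": gate_error}


req = shor_lnn_requirements(1024)
print(req["qubits"], req["depth"], req["gate_error"])
```

The per-gate error here equals $1/640L^4$, which is the target used in the remainder of the analysis.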



B. Gate decomposition 

As with all error corrected models of quantum computation,
not all gate operations can be directly applied in a
fault-tolerant manner. At the logical level, only
preparation of the states $|+\rangle$ and $|0\rangle$, $X$
and $Z$ gates, measurement in the $X$ and $Z$ bases and the
CNOT gate can be directly applied. SWAP gates are achieved
by deforming the trajectory of the defects with which they
are associated. To complete a universal set we add the
$R_z(\pi/8)$ and $R_z(\pi/4)$ rotations [13]. To apply these
gates we perform a teleported gate using the ancillary
states $|A\rangle = (|0\rangle + e^{i\pi/4}|1\rangle)/\sqrt{2}$
and $|Y\rangle = (|0\rangle + i|1\rangle)/\sqrt{2}$. Each
time we attempt the $R_z(\pi/8)$ gate, there is a 50% chance
that an $R_z(\pi/4)$ correction is required.

To ensure that the error rate of the $R_z$ rotations is
sufficiently low, the states $|A\rangle$ and $|Y\rangle$
must be of sufficient fidelity. As these ancillary states
are prepared in the cluster via injection protocols [13],
state distillation is used to increase the fidelity of the
ancilla state [29], consuming multiple $|A\rangle$ or
$|Y\rangle$ states with a lower fidelity. This process can
be concatenated until the desired fidelity is reached. If
$p_l$ is the error probability on the state after $l$
levels of state distillation, then
$p^A_{l+1} = 35(p^A_l)^3$ and $p^Y_{l+1} = 7(p^Y_l)^3$ for
$|A\rangle$ and $|Y\rangle$ respectively [29]. Each
distillation circuit is probabilistic with a failure
probability of $O(p)$.
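The effect of concatenating distillation can be illustrated by iterating the two recursions above. The sketch below is an illustration only: the helper names `distill` and `closed_form` are hypothetical, and the injected error rate of $10^{-3}$ is an example value rather than a figure fixed by the text.

```python
import math


def distill(p: float, levels: int, prefactor: int) -> float:
    """Apply p -> prefactor * p^3 once per distillation level."""
    for _ in range(levels):
        p = prefactor * p**3
    return p


def closed_form(p: float, levels: int, prefactor: int) -> float:
    """Closed form of the same recursion: c^((3^l - 1)/2) * p^(3^l)."""
    return prefactor ** ((3**levels - 1) // 2) * p ** (3**levels)


p_injected = 1e-3  # example injected error rate (assumption)
for level in range(4):
    p_y = distill(p_injected, level, 7)    # |Y> state, prefactor 7
    p_a = distill(p_injected, level, 35)   # |A> state, prefactor 35
    print(level, p_y, p_a)
```

The closed form is the expression that reappears later when the required number of distillation levels is determined.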

Given our set of logical gates, which now includes the
$R_z(\pi/8)$ rotation, we need to decompose the circuit for
Shor's algorithm into a sequence of these fault-tolerantly
implemented gates. For an upper bound on the number of
gates needed we will (pessimistically) assume that every
gate is a non-trivial phase rotation that must be
approximated by a sequence of logical gates found using the
Solovay-Kitaev algorithm [27], and that each gate in this
sequence is the $R_z(\pi/8)$ rotation, which is the most
resource intensive of our logical gates and constitutes
over 50% of the decomposition [42]. Numerical results
suggest that a sequence of
$\Lambda = 19.6 \log(1/\epsilon) - 10.5$ gates is required
to achieve an arbitrary single qubit rotation with accuracy
$\epsilon$ [42]. Hence, to achieve the required error rate,
each logical gate in the circuit can be estimated as a
sequence of $\Lambda = 19.6 \log(640L^4) - 10.5$ gates.
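As a rough illustration of the size of this sequence, the fit can be evaluated numerically. The text does not state the base of the logarithm; the sketch below assumes base 10, and both that choice and the function name `sequence_length` are assumptions made for illustration.

```python
import math


def sequence_length(L: int, log=math.log10) -> float:
    """Gates per approximated rotation for an L-bit problem.

    Uses the fit 19.6 * log(1/eps) - 10.5 with eps = 1 / (640 L^4).
    The logarithm base (default: base 10) is an assumption.
    """
    eps = 1.0 / (640 * L**4)
    return 19.6 * log(1.0 / eps) - 10.5


print(sequence_length(1024))
```

Passing `log=math.log` instead shows how sensitive the estimate is to the choice of base.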



II. RESULTS 
A. Braided circuits 

We now translate the decomposed circuit for Shor's
algorithm to a sequence of braids in the three-dimensional
cluster state. As each gate in the algorithm is assumed
to be an $R_z(\pi/8)$ rotation, this is the logical gate
that will be designed. Shown in Fig. 2 is the braiding
sequence for the logical $R_z(\pi/8)$ rotation at one and
two levels of concatenated state distillation. Full details
of these gate constructions are given in the supplementary
material. Each braiding sequence is compressed manually
into a cuboid such that the cuboids can be stacked tightly
in the spatial and temporal directions in the cluster. The
algorithmic qubits (the ones specified in the Beauregard
circuit) are the green defects (two defects per algorithmic
qubit, occupying a cross sectional area of two logical
cells). Immediately above each algorithmic qubit is an
empty region of the cluster. This empty space is utilised
for the braided logic and SWAP gates required by the
Beauregard circuit. The linear nearest neighbour design of
the original circuit ensures that no further optimisation
is required at the algorithmic level and that the defect
layout of algorithmic qubits in the cluster is sufficient
to realise the depth of the original circuit. Above this
empty region is the distillation space for $|Y\rangle$
states, required to implement an $R_z(\pi/4)$ correction
gate for each applied $R_z(\pi/8)$ gate and Hadamard
operations. Below the algorithmic qubits is the
distillation space for $|A\rangle$ states.

At one level of concatenation, each algorithmic qubit
has a dedicated $|A\rangle$ and $|Y\rangle$ state
distillery. As the algorithmic layer is linear, these
distilleries connect from above and below in the cluster
(direct connections in the topological model correspond to
teleported gates [13]).

For two levels of concatenation the repeating cuboid
encapsulates four algorithmic qubits. The first
concatenation level has physical injection points for low
fidelity $|A\rangle$ and $|Y\rangle$ states, and the size of
the defects is half of what is required at the algorithmic
layer. This reduced size and separation of defects for the
first concatenation level is possible because distillation
circuits have a residual error. If the error of an injected
state at the physical level is $O(10^{-3} - 10^{-4})$, then
implementing full strength error correction for these
circuits is redundant: the residual error from distillation
will always dominate. At the second layer of concatenation,
the residual error becomes commensurate with the logical
error required for computation. Therefore, after the first
layer of concatenation, defects are expanded and separated
to the same size as the required error correction for the
algorithm. Additionally, at the second level of
concatenation, the state injection for corrective
$|Y\rangle$ states, needed for the $|A\rangle$ state
distillation, becomes level one $|Y\rangle$ state circuits,
placed in the relevant free space in the cluster.

FIG. 2: Explicit braiding constructions for an $R_z(\pi/8)$ rotation in the topological cluster at a) one and b) two levels of
concatenated state distillation. The temporal axis in the cluster is illustrated. For a detailed explanation of these constructions
see the supplementary material. Qubits that are part of the algorithmic circuit for Shor are illustrated in green. The cluster
volumes and depths for these two circuits are $V = \{210, 1386\}$ and $D = \{5, 9\}$ respectively. Each sequence is designed such
that they can be stacked together efficiently in either the temporal or spatial directions in the cluster.

The application of corrective $R_z(\pi/4)$ rotations for
$|A\rangle$ state distillation and the probabilistic nature
of the circuits themselves are compensated at the second
level of concatenation by utilising free space to add extra
distilleries [see supplementary material]. At the first
level of concatenation, for $|Y\rangle$ states, there is
sufficient space for one extra circuit adjacent to the
second level circuit, to compensate for any one failure at
level one. For $|A\rangle$ state distillation there is space
for two extra circuits within the cuboid to compensate for
a given circuit failure. These circuit failures occur with
probability $O(p)$, with $p$ the error probability of the
injected states. Given the extra space for spare level one
circuits and assuming $p$ is $O(10^{-3} - 10^{-4})$, we will
have too many failures at level one with a probability
$O(10^{-5} - 10^{-7})$ for $|Y\rangle$ states and
$O(10^{-7} - 10^{-9})$ for $|A\rangle$ states. Therefore, we
expect that we will not have sufficient first level states
approximately once every $10^5$-$10^7$ logical gates. While
these failures result in an increase in circuit depth, they
occur infrequently enough to be neglected. Finally, a total
of 15 first level circuits for corrective $|Y\rangle$
states, needed by the second level $|A\rangle$ state
circuit, are used. The probability that not enough level
one $|Y\rangle$ states are available (i.e. all 15
$R_z(\pi/8)$ corrections are needed and a single level one
$|Y\rangle$ distillation failure occurs) is, for $p$ of
$O(10^{-3})$, also of $O(10^{-7})$. Corrective $|Y\rangle$
states, needed with a probability of 0.5 for the logical
$R_z(\pi/8)$ gate, are located above the algorithmic layer.

FIG. 3: Layout of logical qubits for Shor's algorithm, including
$|A\rangle$ and $|Y\rangle$ distillation.

The total logical volume of cluster for one and two levels
of state distillation can be calculated explicitly. For
one level of concatenation, each $R_z(\pi/8)$ gate occupies
a volume of $V = 5 \times 21 \times 2 = 210$ cells, with a
depth along the temporal axis of the cluster of $D = 5$
cells and a cross sectional area of $A = 21 \times 2$. For
two levels of concatenation the volume is
$V = 8 \times 77 \times 9/4 = 1386$, where the factor of
four accounts for the fact that the cuboid represents four
gates. The number of cells along the temporal axis is
$D = 9$ and the cross sectional area is $A = 77 \times 2$.
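The quoted volumes can be cross-checked as cross-sectional area times temporal depth. The short sketch below does this; the helper name `gate_volume` is hypothetical, and the level-two depth is taken as 9 cells, the value given in the Fig. 2 caption and the one consistent with a volume of 1386.

```python
def gate_volume(area_cells: int, depth_cells: int) -> int:
    """Logical volume of one R_z(pi/8) gate: cross-section times depth."""
    return area_cells * depth_cells


v_level1 = gate_volume(21 * 2, 5)   # one level of distillation
v_level2 = gate_volume(77 * 2, 9)   # two levels, per gate (depth 9 assumed)

# The level-two cuboid holds four gates: 8 x 77 x 9 cells shared by 4.
v_level2_from_cuboid = 8 * 77 * 9 // 4

print(v_level1, v_level2, v_level2_from_cuboid)
```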



B. Cluster Volume 

To determine the total size of the cluster state, we
need to know the amount of error correction and state
distillation required. Each logical gate, a sequence of
$\Lambda$ rotations, requires $\Lambda \times V$ logical
cells. Hence, the failure probability of such a gate needs
to satisfy

$$\Lambda V p_L < \frac{1}{640 L^4}, \qquad (1)$$

where $p_L$ is the error rate of a logical cell and the
right hand side sets the target error rate for gates in the
circuit for Shor's algorithm. For standard depolarizing
noise, we can estimate the failure of a single logical
volume of the cluster as
$p_L \approx C_1 (C_2\, p/p_{th})^{\lfloor (d+1)/2 \rfloor}$,
where $d$ is the distance of the code, $p$ is the physical
error rate, $p_{th}$ is the threshold error rate, which is
estimated to be approximately 0.62%, and $C_1 \approx 0.13$
and $C_2 \approx 0.61$ [14, 33]. Assuming that $p_L \ll 1$
and $1/640L^4 \ll 1$, the distance required to achieve the
target error rate is

$$d > \frac{2 \log(640\, C_1 L^4 \Lambda V)}{\log(p_{th}) - \log(C_2\, p)}. \qquad (2)$$

Here we assume that the residual error after state
distillation is below the error rate of a logical cell,
such that $7^{(3^l-1)/2}\, p^{3^l} < p_L$ and
$35^{(3^l-1)/2}\, p^{3^l} < p_L$ for $|Y\rangle$ states and
$|A\rangle$ states respectively. These conditions determine
the level of state distillation required. Only for very
large $L$ or for high values of $p$ does state distillation
require a maximum of three concatenated levels. The volume
and depth at this level were extrapolated from the level
two circuits as $V = 10000$ and $D = 15$.
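A minimal numerical sketch of the distance calculation is given below. It searches directly for the smallest code distance satisfying condition (1) rather than evaluating the bound (2). Several things here are assumptions made for illustration: the exponent of the logical error model is read as the integer part of $(d+1)/2$, the function names are hypothetical, and the example inputs (a sequence length of 280.5 and the level-two volume of 1386) are illustrative values, not results stated in the text.

```python
C1, C2 = 0.13, 0.61   # fitted constants quoted in the text
P_TH = 0.0062         # threshold error rate, 0.62%


def logical_cell_error(p: float, d: int) -> float:
    """Failure estimate of one logical cell; exponent floor((d+1)/2) assumed."""
    return C1 * (C2 * p / P_TH) ** ((d + 1) // 2)


def required_distance(L: int, p: float, seq_len: float, volume: float) -> int:
    """Smallest odd d with seq_len * volume * p_L(d) < 1 / (640 L^4)."""
    if C2 * p / P_TH >= 1:
        raise ValueError("physical error rate too high for this estimate")
    target = 1.0 / (640 * L**4)
    d = 1
    while seq_len * volume * logical_cell_error(p, d) >= target:
        d += 2  # the exponent (d + 1) // 2 only changes at odd d
    return d


# Example inputs (assumptions): an order of magnitude below threshold.
d = required_distance(L=1024, p=P_TH / 10, seq_len=280.5, volume=1386)
print(d)
```

Because the search works on the inequality itself, it avoids any ambiguity about the logarithm base in the closed-form bound.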

Finally, we can specify the properties of the entire
cluster state. The cluster contains $4L \times A$ logical
cells in cross-section. The total cross-sectional area of
the cluster is $5Ld \times 5dA/4$ physical unit cells. The
third dimension of the cluster represents the temporal axis
and its size determines the computational time. The depth
of a single logical gate is $\Lambda \times D$ logical
cells, since each gate is a sequence of $\Lambda$
$R_z(\pi/8)$ rotations of depth $D$, which corresponds to
$\Lambda D \times (5d/4)$ physical unit cells. Therefore,
the total depth of the cluster is
$(32L^3 \Lambda D) \times (5d/4)$.



C. Physical Resources 

In the architecture, photonic modules are used to prepare
the cluster state and also to initialize and measure
single photons [30]. There is a one-to-one mapping between
the cross sectional size of the 3D cluster and the number
of required modules. For a cluster with a cross-sectional
area of $N_1 \times N_2$ physical unit cells, a total of
$(2N_1 + 1)(2N_2 + 1)$ optical lines are present; half
require two modules for photon detection and half require
four. All optical lines require one module as a
probabilistic source. The number of modules required to
prepare the cluster state is
$2(N_1 + 2)(N_2 + 1) + 2(N_2 + 2)(N_1 + 1)$ [20]. This gives
a total number of modules equal to
$12 + 14N_1 + 14N_2 + 20N_1 N_2$, with $N_1 = 5Ld$ and
$N_2 = 5dA/4$. In addition to the number of modules, we can
specify the physical size of the computer and its runtime.
The dimensions of the computer are $S_x = 5LdM$ and
$S_y = 5dMA/4$, where $M \times M$ is the surface area of a
photonic module (with depth $< M$) [20]. The physical depth
of the computer is $S_z < 2Tc_f$, where $c_f$ is the speed
of light in fiber. This depth is governed by the optical
lines that recycle photons from the detectors to the
sources [30]. The time required to run the algorithm is
$32L^3 \Lambda D \times 5d/4 \times 2T$, where $T$ is the
time required to prepare a single layer of the cluster
state [20], corresponding to the operational speed of the
photonic module.
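The module count and runtime expressions can be checked with a short script. This is a sketch under stated assumptions: the function names are hypothetical, detection is counted as an average of three modules per optical line (half of the lines using two and half using four), and the inputs in the usage line (sequence length 280.5, depth 9, distance 33, $T = 10$ ns) are illustrative values rather than results from the text.

```python
def module_count(n1: int, n2: int) -> int:
    """Total photonic modules for an n1 x n2 unit-cell cross-section."""
    lines = (2 * n1 + 1) * (2 * n2 + 1)    # optical lines
    detection = 3 * lines                   # half use 2 modules, half use 4
    sources = lines                         # one source module per line
    preparation = 2 * (n1 + 2) * (n2 + 1) + 2 * (n2 + 2) * (n1 + 1)
    return detection + sources + preparation


def runtime_seconds(L: int, seq_len: float, depth: int, d: int, T: float) -> float:
    """Runtime 32 L^3 * seq_len * depth * (5d/4) * 2T, with T in seconds."""
    layers = 32 * L**3 * seq_len * depth * (5 * d / 4)
    return layers * 2 * T


# Illustrative inputs only (assumptions, not figures from the text).
print(module_count(10, 20))
print(runtime_seconds(L=1024, seq_len=280.5, depth=9, d=33, T=10e-9))
```

Summing the three contributions reproduces the closed form $12 + 14N_1 + 14N_2 + 20N_1 N_2$ for any cross-section.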

Figure 4 shows the runtime of the algorithm, the total
number of photonic modules and the dimensions of the
computer as functions of the physical error rate and the
problem size. Here we have assumed that $p_{th} = 0.62\%$
[14, 33], $M = 10$ mm and $T = 10$ ns [44]. Contour lines
in Figure 4 indicate where the time to completion is one
year, where the total number of photonic modules is 1
billion, and where the cross sectional dimensions are 100 m.
With an error rate an order of magnitude below $p_{th}$, the
largest problem size that can be completed within a year
is $L \approx 820$.



III. DISCUSSION 

The current record for factoring general integers is
$L = 768$ [45]; hence, as anticipated, these results show
the superiority of quantum computation. However, at
the same time, they do not seem to demonstrate a significant
increase in the processing power of quantum computers.

Our results give a comfortable upper bound for the 
resource requirements using explicit constructions in the 
topological model. The time required to factor a 1024- 
bit number in this analysis is 2.15 years with 1.9 billion 
photonic modules, required to prepare the cluster. An 
interesting question to ask here might be how these num- 
bers can be compared with the fundamental circuit used 
in this analysis. The basic circuit requires a computa- 
tional depth of 32L 3 and 2L qubits. For physical gate 
times of 10ns, for L = 1024, the error correction over- 
head is 2.3 x 10 T temporally and 9.4 x 10 5 in terms of 
qubits/modules. These numbers are based on a phys- 
ical error rate an order of magnitude below threshold. 
This overhead, resulting from the error correction, can
potentially be significantly reduced by optimisations
unrelated to the fundamental hardware. This is highlighted
by the fact that decreasing the error rate by an order of
magnitude reduces the runtime to 1.14 years. The same
speed-up can be achieved by compactifying the topological
circuits shown here by 44% along the temporal axis of the
cluster.

There have been many other resource estimates made for a
computer employing both concatenated and topological coding
models. Thaker et al. estimated that factoring a 1024-bit
number using an architecture based on trapped ions would
take around 25 days [24]. Van Meter et al. estimated that
factoring a 2048-bit number on a distributed architecture
based on quantum dots would take around 400 days [21].
Jones et al. recently improved the latter estimate to
around 10 days [22] by utilising a monolithic array of dots
and increasing the speed of fundamental error correction
cycles. New results in superconducting designs suggest a
factoring time, for a 2000-bit number, of slightly less than
one day [46]. The differences between these estimates arise
from how the algorithm is implemented. Until a complete
analysis is performed, it is meaningless to directly
compare them. In particular, more resource efficient
techniques are utilised in these results, which need to be
explicitly integrated within the topological model for
future estimates.

All resource estimates, including ours, illustrate that a
large fraction of the overhead arises from the need to
prepare ancillary states. Other results assume sufficient
space within the computer such that ancillary protocols
can be completed rapidly enough that the depth of the
algorithmic circuit is unchanged. This could be of
significant benefit. However, the appropriate routing of
these ancillary protocols needs to be made explicit. How
distillation circuits are interfaced with data qubits needs
to be detailed, and exactly which protocols are utilised
needs to be analysed. Estimates from Refs. [22, 46] use the most
optimal circuit for Shor's algorithm [SUET]. This circuit 
has not yet been adapted to the geometric constraints of 
the topological cluster. Until an appropriate construc- 
tion is presented for the topological cluster it is difficult 
to assume that the circuit size will remain unchanged. If 
such a circuit design is presented, then we anticipate im- 
mediate reductions in resources. Previous results also as- 
sume that various subcomponents of a fault-tolerant im- 
plementation can be applied without space/time penalty. 
There have been many results published optimising various
components in a fully error corrected quantum algorithm
[48-51]. However, each of these results has been derived in
isolation, some have not been converted into the
topological model and none have been carefully integrated
together. This is the primary challenge of topological
computation. Subcomponents may be efficient, but the
success of a large-scale computation requires delicate
integration. Our results illustrate that there is a
significant gap between optimistic resource estimates and
those performed using explicit circuit constructions.

It is clear that before a quantum computer is actually
built, algorithmic compilation is a necessity. Reducing
the burden on experimental development is ultimately a
function of how we realise abstract algorithms. This
analysis illustrates that there is much work to be done.
While the topological model is promising, its ultimate
success is dependent on continual efforts to integrate all
necessary protocols in a way that minimises the number of
devices and the time required to execute an algorithm.
error rate p. The discontinuities represent points where the concatenation level for state distillation increases.

V. APPENDIX

Here we detail the component constructions required
to build a logical $R_z(\pi/8)$ gate, which forms the basis
for the resource estimates illustrated in the main text.
The goal of these braid constructions is to minimise the
total logical volume needed to implement the fault-tolerant
gate and to ensure that braids are constructed in a manner
compatible with the underlying circuits and in such a
way that they can be packed densely within the overall
topological cluster. The techniques used in this section to
achieve compact braiding utilise results from Ref. [14]
and Ref. [48].



A. Primitive operations 

First let us introduce the primitive fault-tolerant
operations that are allowed in the topological model. We
introduce five types of gates: measurement, initialisation,
state injection, the two-qubit CNOT and the teleported
phase rotation, $R_z(\theta)$, $\theta = \{\pi/4, \pi/8\}$.
These are illustrated in Fig. 5.



VI. COMPACTIFIED DISTILLATION CIRCUITS



With these primitive operations, the standard circuits
for $|A\rangle$ and $|Y\rangle$ state distillation can be
converted to a compactified braiding sequence. Shown in
Fig. 6 are the canonical circuits for distillation. For the
$|Y\rangle$ state, one half of a logical Bell pair is
encoded with the [[7,1,3]] error correcting code, after
which a transversal $R_z(\pi/4)$ gate is applied to the
encoded half, which is then measured in the $X$ basis. For
$|A\rangle$ state distillation, half of the Bell pair is
encoded with the [[15,1,3]] Reed-Muller code and a
transversal $R_z(\pi/8)$ gate is applied prior to logical
measurement. Given the correct set of measurement results,
a purified copy of the $|Y\rangle$ or $|A\rangle$ state is
teleported to the output half of the original Bell state.
In Ref. [48] it was shown how the canonical versions of the
braided logic can be compactified using previously known
techniques and a process known as defect bridging.
Illustrated in Fig. 7 are the compactified versions of the
circuits shown in Fig. 6. In Fig. 7a) we show the
compactified version of the $|A\rangle$ state distillation
circuit. The sets of coloured pyramids represent the
injection and gate teleportation needed to realise the
transversal $R_z(\pi/8)$ and corrective $R_z(\pi/4)$ gates
applied to the encoded half of the initial Bell state, with
colour coding matching Fig. 6. Embedded within the defect
structure is the logical $Z$ measurement present in the
teleported gate circuit, and this logical measurement
result dictates whether a further corrective $R_z(\pi/4)$
rotation needs to be applied (again via injection and
teleportation, this time using a $|Y\rangle$ state). For
this circuit, a strictly enforced temporal axis is needed
because the logical measurement of the first injection and
teleportation dictates whether the second one needs to be
applied.

In Fig. 7b) and c) we show the compactified version
of the $|Y\rangle$ state distillation circuit. Fig. 7b) illustrates
the compact version, using known techniques excluding
bridging, while Fig. 7c) illustrates the final version
after defect bridging. We present both circuits as they will
both be used in a concatenated distillation sequence.
Unlike the $|A\rangle$ state distillation circuits, there is no strict
enforcement of a temporal axis in the cluster. Although the
transversal $R_z(\pi/4)$ operation for $|Y\rangle$ state distillation is
also probabilistic for each teleported gate, the correction
operation is a simple Z gate which can be applied via
appropriate classical tracking of the Pauli frame. In all
three versions of the circuit, the relevant output defects
are shown with black caps.
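
The correction rules quoted above can be checked with a small state-vector calculation. The sketch below is illustrative only and rests on assumptions not spelled out in this appendix: the rotations are taken as $R_z(\pi/8) = \mathrm{diag}(1, e^{i\pi/4})$ and $R_z(\pi/4) = \mathrm{diag}(1, i)$ up to a global phase, the ancilla states as $|A\rangle = (|0\rangle + e^{i\pi/4}|1\rangle)/\sqrt{2}$ and $|Y\rangle = (|0\rangle + i|1\rangle)/\sqrt{2}$, and the teleportation circuit as a CNOT from data to ancilla followed by a Z measurement of the ancilla. All function names are hypothetical.

```python
import cmath
import math

def apply_phase(phi, state):
    """Apply diag(1, e^{i*phi}) to a single-qubit state (a, b)."""
    a, b = state
    return (a, b * cmath.exp(1j * phi))

def teleport_branches(phi, state):
    """Unnormalised data states for ancilla measurement outcomes 0 and 1.

    Assumed circuit: the ancilla starts in (|0> + e^{i*phi}|1>)/sqrt(2),
    a CNOT is applied (data controls, ancilla is the target), and the
    ancilla is then measured in the Z basis.
    """
    a, b = state
    w = cmath.exp(1j * phi)
    # Joint state: a|0>(|0> + w|1>) + b|1>(|1> + w|0>), up to normalisation.
    return {0: (a, b * w), 1: (a * w, b)}

def same_state(u, v, tol=1e-12):
    """True if u and v agree up to normalisation and a global phase."""
    overlap = abs(u[0].conjugate() * v[0] + u[1].conjugate() * v[1]) ** 2
    norms = (abs(u[0]) ** 2 + abs(u[1]) ** 2) * (abs(v[0]) ** 2 + abs(v[1]) ** 2)
    return abs(overlap - norms) < tol

def check_teleported_gate(phi, state):
    """Outcome 0 yields the gate directly; outcome 1 needs diag(1, e^{2i*phi})."""
    target = apply_phase(phi, state)
    branches = teleport_branches(phi, state)
    corrected = apply_phase(2 * phi, branches[1])
    return same_state(branches[0], target) and same_state(corrected, target)

psi = (0.6 + 0.0j, 0.8j)  # arbitrary normalised test state
a_state_ok = check_teleported_gate(math.pi / 4, psi)  # |A>: correction is diag(1, i)
y_state_ok = check_teleported_gate(math.pi / 2, psi)  # |Y>: correction is diag(1, -1) = Z
```

Under these assumptions the two measurement branches carry equal norm, so each occurs half the time. The outcome-1 correction is the phase gate for $|A\rangle$ injection and a plain Z for $|Y\rangle$ injection, which is why only the former forces a temporal ordering.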



VII. LEVEL 1 CONCATENATED GATE

The level one concatenated $R_z(\pi/8)$ gate is relatively
simple to construct, primarily because all injection points
correspond to physical qubits in the topological cluster.
For each algorithmic qubit in the computer, the region
below is devoted to $|A\rangle$ state distillation while one
logical cell above contains empty cluster to enable SWAP and
CNOT operations between algorithmic qubits. The region
above this layer is devoted to $|Y\rangle$ state distillation.
Unlike higher levels of concatenation, a single $R_z(\pi/8)$ gate
can be defined which can be repeated along the spatial
and temporal axes of the cluster. Fig. 8a) illustrates
the complete gate, which has a depth along the temporal
axis of $D = 5$ and a cross-sectional area in the cluster of
$A = 21 \times 2$. The algorithmic qubit is idle until the
distillation operations are complete and, with a 50% probability,
the corrective $R_z(\pi/4)$ gate need not be applied. At one
level of concatenation, all defects in the circuit have the
same size and separation. For a level one concatenated
circuit there is no extra space for distillation circuits to
compensate for a failure event in the circuit itself. As the
logical error rate required by the computer at one level
of concatenation is high (for an experimentally feasible
physical gate error), the total number of gates per logical
time step will be comparatively small and hence the
probability of a failed distillation circuit per logical time
step is quite low. Therefore, in the event that a
distillation circuit fails, this structure would be repeated. In our
analysis we assume that all gates are $R_z(\pi/8)$ rotations,
although in reality this is not the case. In the event of
an occasional failure, a repeated distillation circuit can
be performed in cluster spaces otherwise vacant due to
other logic operations during computation.



VIII. LEVEL 2 CONCATENATED GATE

Forming a second level concatenated $R_z(\pi/8)$ gate is
significantly more complex. This is due to injection
points at the second level coming from outputs of level
one circuits. From Fig. 7a) you can see that accessibility
to the $|A\rangle$ state injection points within the braiding
structure is quite limited. As a well defined temporal axis
has to be maintained, we need to modify the $|A\rangle$ state
circuit such that these injection points can be connected
easily to the level one outputs. The following sequence
of images illustrates the deformations.

These deformations allow us to access the 15 injection
points that will use the output from the level one
distillation circuits. Note that the temporal axis of the circuit is
still well defined as the injection points for the corrective
$R_z(\pi/4)$ gates occur after the transversal $R_z(\pi/8)$ gates.

[Figure 5 panels: a) Primal Defect Initialization and Measurement; b) Dual Defect Initialization and Measurement; c) Primal/Primal CNOT; d) $R_z(\{\pi/4, \pi/8\})$ Teleported Gate. Each panel is drawn against the temporal axis.]

FIG. 5: Examples of basic operations that are used to construct braid sequences. a) Primal defects in a horseshoe shape are
used to prepare a logical qubit in the state $|0\rangle$ and to measure a logical qubit in the Z basis. An arbitrary state can be prepared
by measuring one of the physical qubits (shown in pink) in a rotated basis as the defects are created. b) Similarly, dual defects
in a horseshoe shape are used to prepare a logical qubit in the state $|+\rangle$ and to measure a logical qubit in the X basis. c) A
CNOT gate can be achieved by braiding a pair of dual defects (prepared in the state $|+\rangle$) with three pairs of primal defects (the
control qubit, the target qubit and an extra qubit prepared in the state $|0\rangle$) [14]. d) A teleported Z rotation can be achieved
by attaching the relevant ancillary state to the data qubit [14].

We can now combine the structure in Fig. 13 with
15 copies of level one distillation circuits. As noted in
the main text and introduced in Ref. [48], because of
the residual error associated with distillation, the error
correction associated with level one does not have to be
as strong as level two. The size and separation of
defects at level two must match the error correction
strength at the algorithmic level; however, the strength of
error correction for the level one distillation circuits only
needs to be as strong as the residual error associated with
the circuits themselves. Hence at level one we reduce the
size and separation of defects by a factor of two. This
reduction in required error correction at level one allows
us to stack 17 copies of Fig. 7a) along the input edge
of the modified level two circuit and form the appropriate
input/output connections. The fact that there is space to stack 17 copies
of the level one distillation circuit helps us to protect
against distillation failure at level one. The failure of the
first level to produce enough states requires three
distillation circuits to fail, which occurs with a probability of
order $\binom{17}{3} p^3$, which for $p = 10^{-3}$ is $O(10^{-7})$. Therefore,
a failure at the first level of concatenation will not occur
at every logical time step, and only rarely
will an algorithmic qubit have to wait until a level one
distillation circuit is redone.
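
The failure estimate above can be reproduced numerically. The following is a minimal sketch under stated assumptions: each of the 17 level one circuits fails independently with the same probability $p$, 15 successful outputs are needed, and the leading-order term is read as a binomial count of the ways three circuits can fail. The variable and function names are illustrative.

```python
from math import comb

def prob_at_least_k_failures(n, k, p):
    """Probability that k or more of n independent circuits fail."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

n_copies, n_needed, p_fail = 17, 15, 1e-3

# With two spare circuits, the first level falls short only if three fail.
min_failures = n_copies - n_needed + 1

# Leading-order estimate: choose which three circuits fail.
leading_order = comb(n_copies, min_failures) * p_fail**min_failures

# Full binomial tail, for comparison with the leading-order term.
exact_tail = prob_at_least_k_failures(n_copies, min_failures, p_fail)
```

Both quantities come out at a few times $10^{-7}$ for $p = 10^{-3}$, in line with the order of magnitude quoted in the text.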



After defects are output from the level one circuits
they are expanded to full error correction strength and
attached to the level two circuit at the appropriate points
(in Fig. 15 we have removed the pyramid structures at
the injection points of the level two circuit, but retain
the colour coding). This expansion needs to be done
carefully. While the level one circuit can have a
separation between defects half that of the level two circuit,
the separation between level one defects and level two defects
must be the same as separations within the level two
circuit. This is because error chains can begin on a level
two defect and terminate on a level one defect.

FIG. 6: Quantum circuits required for distillation of the states
a) $|Y\rangle$ and b) $|A\rangle$. The coloured boxes represent where error
prone states are injected into the cluster. This colour coding
corresponds to the coloured injection points in braiding
diagrams throughout this paper.

The final part of this circuit is the corrective $R_z(\pi/4)$
operations that may need to be applied at the second
level of concatenation. Unlike level one circuits, these
gates need to utilise $|Y\rangle$ states that have been distilled
to level one. Given the compact nature of the level one
$|Y\rangle$ state circuits [Fig. 7c)], there is sufficient space
adjacent to the relevant injection points to place 15 circuits,
one for each possible $R_z(\pi/4)$ correction of the second
level $|A\rangle$ state circuit, as Fig. 16 illustrates. The connection
points of each output of the level one $|Y\rangle$ state
distillation are expanded to the appropriate separation and size
before being joined onto the level two circuit for $|A\rangle$ state
distillation. As with level one distillation circuits for
$|A\rangle$ states, the distillation circuits for corrective $R_z(\pi/4)$
states can fail. With this defect arrangement, the
leading order failure channel is when a single $|Y\rangle$ state circuit
fails and all 15 correction gates need to be applied. This
probability is $O(10^{-7})$ for $p = O(10^{-3})$,
again ensuring that additional time will only
be needed in the computer every $O(10^{7})$ logical gates.



The braiding structure of Fig. 16 now allows for the
application of an encoded $R_z(\pi/8)$ gate at two levels of
concatenated distillation, but there are still two more
things to consider. While we have introduced 15 copies
of level one $|Y\rangle$ state distillation for the correction of the
second level $|A\rangle$ state circuit, we still require a level two
distilled $|Y\rangle$ state in order to apply a possible correction
gate to the final $R_z(\pi/4)$ rotation. As with the level
one $R_z(\pi/8)$ gate, this distillation is performed above
the algorithmic layer in the cluster.

The second level distillation circuit for $|Y\rangle$ states is
shown in Fig. 14. Note that we have used two different
circuits for level one distillation [Fig. 7c)] and level two
[Fig. 7b)]. We have not attempted to perform further
compression of the level two circuit manually as its
volume is sufficiently small as to not impact the depth of
the overall $R_z(\pi/8)$ gate.

This circuit can now be incorporated into the larger
structure, with the appropriate SWAP space left between
the algorithmic layer and the $|Y\rangle$ state distillation layer.
Along with the second level $|Y\rangle$ state distillation circuit,
we have illustrated where an additional first level circuit
can be placed in order to compensate for the possibility
that one of the seven first level circuits fails. These failures
are also compensated by the fact that the final second
level $|Y\rangle$ states are only needed 50% of the time. Hence
this circuit element, utilised every time an $R_z(\pi/8)$ gate
is applied, oversupplies distilled $|Y\rangle$ states.

From Fig. 17 the last issue to solve should be clear.
The space utilised for $|A\rangle$ state distillation occupies a
cross-sectional space in the lattice equal to four
algorithmic qubits. Therefore, we need to duplicate the
distillation structure vertically in order to produce sufficient
states to serve these four algorithmic qubits. Stacking
three additional copies of the $|A\rangle$ state distillation
circuits below the one in Fig. 17 gives us the stackable
braiding sequence which enacts $R_z(\pi/8)$ gates over four
algorithmic qubits. This leads to the final structure
illustrated in the main text.
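
As a bookkeeping aid, the circuit counts quoted in this section can be tallied as follows. This is only a sketch: the per-qubit counts are taken from the text, but the assumption that each of the four algorithmic qubits in the repeating unit is served by an identical set of circuits (including its own $|Y\rangle$ distillation layer) is ours, and the dictionary labels are illustrative.

```python
# Distillation circuits serving one algorithmic qubit, as counted in the text.
circuits_per_qubit = {
    "level-1 |A> (15 needed + 2 spare)": 17,
    "level-1 |Y> for level-2 |A> corrections": 15,
    "level-2 |A>": 1,
    "level-1 |Y> feeding level-2 |Y> (7 needed + 1 spare)": 8,
    "level-2 |Y>": 1,
}

# Assumption: every qubit in the four-qubit repeating unit gets the same set.
QUBITS_PER_UNIT = 4

per_qubit = sum(circuits_per_qubit.values())
per_unit = QUBITS_PER_UNIT * per_qubit
```

The tally counts circuits only; it says nothing about their relative volumes, which differ because level one defects are half the size and separation of level two defects.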

Implicit in these images (and throughout the
discussion) is the possibility of dynamical configuration of these
structures if level one distillation circuits fail. These
diagrams assume that all distillation circuits output
successfully and can be connected as shown. If this is not
the case, a reconfiguration of the overall circuit is needed.
The design of these reconfigured circuits can be done
offline, but their actual application will be chosen
dynamically as the computation is run. As discussed, we have
given sufficient space within the second level structure to
ensure that a total failure (i.e. one where we do not have
sufficient distilled states at the first level of
concatenation) does not occur at every time step of computation.
However, it is expected that at every logical time step
at least one circuit somewhere in the computer will fail,
and such failures can be compensated by these extra resources. The
dynamical reconfiguration is not expected to increase the
depth of these fault-tolerant gates in a significant way,
but it does still need to be calculated. Given the large
number of possible failure points, the specification of all
possible configurations will need to be done in an automated
manner and is the focus of future work.

FIG. 7: Braiding diagrams for a) $|A\rangle$ state distillation and b), c) $|Y\rangle$ state distillation. These compactified circuits are from
Ref. [33]. We illustrate two designs for $|Y\rangle$ state distillation as the circuit in panel b) will be utilised at the second concatenation
level. Coloured pyramid structures represent state injection points to implement the $R_z(\pi/4)$ and $R_z(\pi/8)$ gates needed in the
distillation circuits.

FIG. 8: $R_z(\pi/8)$ gate at one level of concatenated distillation. Panel a) shows the connected circuit, including $|A\rangle$ state
distillation, $|Y\rangle$ state distillation for the corrective $R_z(\pi/4)$ operation and the (green) algorithmic qubit. Panel b) illustrates
each of the three components. The temporal axis through the cluster is illustrated.

FIG. 9: The circuit from Fig. 7a) is rotated 90 degrees around
the injection points. This opens access to these junctions from
the input side. Three injection points (white, translucent
pink and red) remain difficult to access due to primal defects.

FIG. 10: The primal defect strand near the red injection point
is rotated, giving input access to the injection point.

FIG. 13: The primal defect strand blocking access to the
translucent pink injection point is deformed along the
rightmost primal defect strand and then further deformed
to the left. This gives access to the injection point from the
input side.




FIG. 14: Second level distillation circuit for $|Y\rangle$ states. In
order to have clean access to the seven injection points at
level two, a less optimised version of the circuit is used. This
structure does have the ability to be compressed further;
however, the majority of resources needed by the $R_z(\pi/8)$ gate is
dedicated to $|A\rangle$ state distillation and consequently we do not
compact this circuit further. Space in the structure also
exists for an additional level one distillation circuit to protect
against circuit failure at level one. This circuit can connect
without penalty by a reordering of qubits in the second level
circuit and connecting the auxiliary circuit to the output side.







FIG. 15: Connecting multiple first level distillation circuits to the modified second level distillation circuit. Before the output of
level one is connected to the relevant injection points at level two the defects are expanded to the full strength error correction
needed at the algorithmic level. The first level distillation circuits can utilise a smaller error correcting code as the residual
error from the distillation circuit will be higher than the required protection of the data qubits. However, the first level defects
must maintain a separation from the second level circuits compatible with the strength of error correction at level two.

FIG. 16: 15 copies of level one $|Y\rangle$ state distillation are introduced into an empty cluster region to provide corrective operations
to the $|A\rangle$ state distillation circuit at second level. These circuits can utilise smaller defects as they are level one circuits.

FIG. 17: Complete structure for the $R_z(\pi/8)$ rotation for one of the four algorithmic qubits present in the repeating unit (we
have reversed the temporal axis in this image for readability). Two levels of $|A\rangle$ state distillation, with the necessary correction
gates, sit below the algorithmic layer and two levels of $|Y\rangle$ state distillation exist above the algorithmic layer for the final
$R_z(\pi/4)$ correction, required 50% of the time. There are additional redundant circuit elements that are included to protect
against the failure of level one distillation circuits. The connection structures illustrated here assume no such failures occur.
This circuit would have to be modified dynamically depending on the result of certain measurements.



